Direct Sparse Visual-Inertial Odometry using Dynamic Marginalization
We present VI-DSO, a novel approach for visual-inertial odometry, which
jointly estimates camera poses and sparse scene geometry by minimizing
photometric and IMU measurement errors in a combined energy functional. The
visual part of the system performs a bundle-adjustment-like optimization on a
sparse set of points, but unlike keypoint-based systems it directly minimizes
a photometric error. This makes it possible for the system to track not only
corners, but any pixels with sufficiently large intensity gradients. IMU information
is accumulated between several frames using measurement preintegration, and is
inserted into the optimization as an additional constraint between keyframes.
We explicitly include scale and gravity direction into our model and jointly
optimize them together with other variables such as poses. As the scale is
often not immediately observable from IMU data, this allows us to initialize
our visual-inertial system with an arbitrary scale instead of having to delay
the initialization until everything is observable. We perform partial
marginalization of old variables so that updates can be computed in a
reasonable time. In order to keep the system consistent we propose a novel
strategy which we call "dynamic marginalization". This technique allows us to
use partial marginalization even in cases where the initial scale estimate is
far from the optimum. We evaluate our method on the challenging EuRoC dataset,
showing that VI-DSO outperforms the state of the art.
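For concreteness, the following is a minimal sketch of how such a combined energy functional can be written; the Huber norm, the weighting factor λ, and the residual notation are illustrative assumptions rather than the paper's exact formulation:

```latex
E_{\mathrm{total}}
  = \underbrace{\sum_{i}\sum_{\mathbf{p}\in\mathcal{P}_i}\sum_{j\in\mathrm{obs}(\mathbf{p})}
      \bigl\| I_j\!\bigl(\pi(\mathbf{p})\bigr) - I_i(\mathbf{p}) \bigr\|_{\gamma}}_{\text{photometric error}}
  \;+\; \lambda \underbrace{\sum_{k} \mathbf{r}_k^{\top}\,\boldsymbol{\Sigma}_k^{-1}\,\mathbf{r}_k}_{\text{inertial error}}
```

Here π warps point p from its host keyframe i into an observing frame j, and r_k denotes the residual between the preintegrated IMU measurement and the relative state (pose, velocity, biases, scale, gravity direction) of consecutive keyframes.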
From Monocular SLAM to Autonomous Drone Exploration
Micro aerial vehicles (MAVs) are strongly limited in their payload and power
capacity. In order to implement autonomous navigation, algorithms are therefore
desirable that use sensory equipment that is as small, low-weight, and
low-power consuming as possible. In this paper, we propose a method for
autonomous MAV navigation and exploration using a low-cost consumer-grade
quadrocopter equipped with a monocular camera. Our vision-based navigation
system builds on LSD-SLAM, which estimates the MAV trajectory and a semi-dense
reconstruction of the environment in real time. Since LSD-SLAM only determines
depth at high-gradient pixels, texture-less areas are not directly observed,
so previous exploration methods that assume dense map information cannot be
applied directly. We propose an obstacle mapping and exploration approach
that takes the properties of our semi-dense monocular SLAM system into account.
In experiments, we demonstrate our vision-based autonomous navigation and
exploration system with a Parrot Bebop MAV.
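As a rough illustration of how semi-dense map points can feed an exploration-oriented obstacle map, here is a minimal 2D log-odds occupancy sketch in Python; the grid layout, update constants, and 2D simplification are assumptions for illustration, not the paper's actual 3D mapping pipeline. The key property it preserves is that untouched cells stay unknown rather than free, which is exactly what matters when texture-less areas are unobserved.

```python
import numpy as np

def update_occupancy(grid, origin, points, resolution=0.1,
                     free_val=-0.4, occ_val=0.85):
    """Integrate semi-dense SLAM points into a 2D log-odds occupancy grid.

    Cells never touched by a ray stay at 0 (unknown), which matters for
    semi-dense maps: texture-less regions are unobserved, not free.
    """
    for p in points:
        # Ray-march from the camera origin towards the measured point,
        # marking traversed cells as (more likely) free.
        direction = p - origin
        dist = np.linalg.norm(direction)
        if dist < 1e-6:
            continue
        step = direction / dist * resolution
        cell_pos = origin.astype(float).copy()
        for _ in range(int(dist / resolution)):
            i, j = (cell_pos / resolution).astype(int)
            if 0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]:
                grid[i, j] += free_val
            cell_pos += step
        # Mark the end cell (the observed high-gradient point) as occupied.
        i, j = (p / resolution).astype(int)
        if 0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]:
            grid[i, j] += occ_val
    return grid

# Example: a 10 m x 10 m grid at 0.1 m resolution, camera at the center.
grid = np.zeros((100, 100))
origin = np.array([5.0, 5.0])
points = np.array([[8.0, 5.0], [5.0, 9.0]])  # two semi-dense map points
update_occupancy(grid, origin, points)
```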
Rolling-Shutter Modelling for Direct Visual-Inertial Odometry
We present a direct visual-inertial odometry (VIO) method which estimates the
motion of the sensor setup and sparse 3D geometry of the environment based on
measurements from a rolling-shutter camera and an inertial measurement unit
(IMU).
The visual part of the system performs a photometric bundle adjustment on a
sparse set of points. This direct approach does not extract feature points and
is able to track not only corners, but any pixels with sufficient gradient
magnitude. Neglecting rolling-shutter effects in the visual part severely
degrades the accuracy and robustness of the system. In this paper, we incorporate a
rolling-shutter model into the photometric bundle adjustment that estimates a
set of recent keyframe poses and the inverse depth of a sparse set of points.
IMU information is accumulated between several frames using measurement
preintegration, and is inserted into the optimization as an additional
constraint between selected keyframes. For every keyframe we estimate not only
the pose but also velocity and biases to correct the IMU measurements. Unlike
systems with global-shutter cameras, we use both IMU measurements and
rolling-shutter effects of the camera to estimate velocity and biases for every
state.
Finally, we evaluate our system on a novel dataset that contains global-shutter
and rolling-shutter images, IMU data, and ground-truth poses for ten different
sequences, which we make publicly available. The evaluation shows that the proposed
method outperforms a system in which rolling shutter is not modelled, and achieves
similar accuracy to the global-shutter method on global-shutter data.
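To illustrate the core idea of rolling-shutter modelling, namely that each image row has its own capture time and hence its own pose, here is a minimal Python sketch; the constant per-line delay and the slerp/linear interpolation between two keyframe states are simplifying assumptions, not the paper's exact parametrization.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def row_capture_time(row, t_frame, line_delay):
    """Rolling shutter: rows are exposed sequentially, each offset from
    the frame timestamp by a constant per-line delay."""
    return t_frame + row * line_delay

def pose_at_row(row, t_frame, line_delay, t0, t1, pose0, pose1):
    """Interpolate the camera pose at the capture time of a given row.

    pose0, pose1: (Rotation, translation) tuples at times t0 and t1.
    Assumes roughly constant motion between the two states.
    """
    t = row_capture_time(row, t_frame, line_delay)
    alpha = (t - t0) / (t1 - t0)
    R = Slerp([t0, t1], Rotation.concatenate([pose0[0], pose1[0]]))(t)
    trans = (1.0 - alpha) * pose0[1] + alpha * pose1[1]
    return R, trans

# Example: pose of image row 240 in a frame captured between two states.
pose0 = (Rotation.identity(), np.zeros(3))
pose1 = (Rotation.from_euler("z", 10, degrees=True), np.array([0.1, 0.0, 0.0]))
R, p = pose_at_row(240, t_frame=0.0, line_delay=3e-5,
                   t0=0.0, t1=0.05, pose0=pose0, pose1=pose1)
```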
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
In this paper, we propose MonoRec, a semi-supervised monocular dense
reconstruction architecture that predicts depth maps from a single moving
camera in dynamic environments. MonoRec is based on a multi-view stereo (MVS)
setting, which encodes the information of multiple consecutive images in a cost
volume. To deal with
dynamic objects in the scene, we introduce a MaskModule that predicts moving
object masks by leveraging the photometric inconsistencies encoded in the cost
volumes. Unlike other MVS methods, MonoRec is able to predict accurate depths
for both static and moving objects by leveraging the predicted masks.
Furthermore, we present a novel multi-stage training scheme with a
semi-supervised loss formulation that does not require LiDAR depth values. We
carefully evaluate MonoRec on the KITTI dataset and show that it achieves
state-of-the-art performance compared to both multi-view and single-view
methods. With the model trained on KITTI, we further demonstrate that MonoRec
is able to generalize well to both the Oxford RobotCar dataset and the more
challenging TUM-Mono dataset recorded by a handheld camera. Training code and
pre-trained model will be published soon.
Comment: Project page with video can be found at
https://vision.in.tum.de/research/monorec. 14 pages, 10 figures, 5 tables.
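To give a feel for the cost-volume construction that MonoRec builds on, here is a toy numpy sketch; MonoRec itself uses an SSIM-based photometric measure and learned network components, so the absolute-difference cost and the array shapes below are illustrative assumptions only.

```python
import numpy as np

def photometric_cost_volume(ref, warped_stack):
    """Build a per-pixel photometric cost volume.

    ref:          (H, W) reference image.
    warped_stack: (N, D, H, W) neighbouring images warped into the
                  reference view for each of D depth hypotheses.
    Returns a (D, H, W) volume; photometric inconsistencies across the
    N views (e.g. caused by moving objects) show up as high cost at
    every depth hypothesis, which is the cue a moving-object mask can use.
    """
    # Absolute intensity difference to the reference, averaged over views.
    return np.abs(warped_stack - ref[None, None]).mean(axis=0)

# Toy example: 2 neighbouring views, 8 depth hypotheses, 64x64 images.
ref = np.random.rand(64, 64).astype(np.float32)
warped = np.random.rand(2, 8, 64, 64).astype(np.float32)
volume = photometric_cost_volume(ref, warped)     # (8, 64, 64)
depth_index = volume.argmin(axis=0)               # best hypothesis per pixel
```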
Omnidirectional DSO: Direct Sparse Odometry with Fisheye Cameras
We propose a novel real-time direct monocular visual odometry method for
omnidirectional cameras. Our method extends direct sparse odometry (DSO) by
using the unified omnidirectional model as a projection function, which can be
applied to fisheye cameras with a field-of-view (FoV) well above 180 degrees.
This formulation allows for using the full area of the input image even with
strong distortion, while most existing visual odometry methods can only use a
rectified and cropped part of it. Model parameters within an active keyframe
window are jointly optimized, including the intrinsic/extrinsic camera
parameters, 3D position of points, and affine brightness parameters. Thanks to
the wide FoV, image overlap between frames becomes larger and points are more
widely distributed in space. Our results demonstrate that our method provides
increased accuracy and robustness over state-of-the-art visual odometry
algorithms.
Comment: Accepted by IEEE Robotics and Automation Letters (RA-L), 2018, and
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
2018.
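As a sketch of why the unified omnidirectional model admits a field of view beyond 180 degrees, here is a minimal projection function in Python; the intrinsic values and the distortion-free simplification are assumptions for illustration (practical variants of the model add distortion terms).

```python
import numpy as np

def unified_project(X, xi, fx, fy, cx, cy):
    """Project a 3D point with the (distortion-free) unified model:
    first onto the unit sphere, then through a pinhole whose center is
    shifted by xi along the optical axis. For xi > 0, points with
    negative z can still project validly, enabling FoV > 180 degrees.
    """
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X)      # distance of the point from the origin
    z = X[2] + xi * d          # shifted projection denominator
    if z <= 0:
        raise ValueError("point is outside the valid projection region")
    return np.array([fx * X[0] / z + cx, fy * X[1] / z + cy])

# A point slightly behind the image plane (z < 0) still projects:
print(unified_project([0.5, 0.0, -0.1], xi=0.8, fx=300, fy=300, cx=320, cy=240))
```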